In [ ]:
import sys
import re
Define the extract_names() function below and change baby_names() to call it.
For writing regex, it's nice to include a copy of the target text for inspiration.
Here's what the HTML looks like in the baby.html files:
Suggested milestones for incremental development:
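A reasonable first milestone is to pull the year and a few name/rank rows out of a small HTML string before touching any file. The sketch below is only an illustration: `SAMPLE_HTML`, `find_year()`, and `find_rows()` are hypothetical names, and the markup is an assumed shape (a "Popularity in YYYY" heading and table rows holding rank, boy name, girl name), so check it against the real baby.html files and adjust the patterns.

```python
import re

# Hypothetical sample; the real baby.html markup may differ.
SAMPLE_HTML = """
<h3 align="center">Popularity in 1990</h3>
<tr align="right"><td>1</td><td>Michael</td><td>Jessica</td>
<tr align="right"><td>2</td><td>Christopher</td><td>Ashley</td>
"""

def find_year(text):
    """Return the four-digit year after 'Popularity in', or None if absent."""
    match = re.search(r'Popularity\sin\s(\d\d\d\d)', text)
    return match.group(1) if match else None

def find_rows(text):
    """Return a list of (rank, boy, girl) string tuples, one per table row."""
    return re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', text)

print(find_year(SAMPLE_HTML))
print(find_rows(SAMPLE_HTML))
```

Because the second pattern has three groups, `re.findall()` returns tuples rather than plain strings, which makes the rows easy to unpack later.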
In [ ]:
def extract_names(filename):
    """
    Given a file name for baby.html, returns a list starting with the year string
    followed by the name-rank strings in alphabetical order.
    ['2006', 'Aaliyah 91', 'Aaron 57', 'Abagail 895', ...]
    """
    # +++your code here+++
    return
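Once the year and the (rank, boy, girl) rows are available, the list described in the docstring can be assembled with a dict and `sorted()`. This is one possible sketch, not the reference solution: `build_name_list()` is a hypothetical helper, and keeping the first rank seen when a name appears more than once is an assumption about the desired behavior.

```python
def build_name_list(year, rows):
    """Build [year, 'Name rank', ...] with the names in alphabetical order.

    rows is an iterable of (rank, boy, girl) string tuples.
    """
    names_to_rank = {}
    for rank, boy, girl in rows:
        for name in (boy, girl):
            # Assumption: if a name shows up twice, keep the first rank seen.
            if name not in names_to_rank:
                names_to_rank[name] = rank

    result = [year]
    for name in sorted(names_to_rank):
        result.append(name + ' ' + names_to_rank[name])
    return result

print(build_name_list('1990', [('1', 'Michael', 'Jessica'),
                               ('2', 'Christopher', 'Ashley')]))
```

If the rows arrive in ascending rank order, "first seen" is also the best rank for a name used for both boys and girls.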
In [ ]:
def baby_names(file_list, summary=False):
    # +++your code here+++
    # For each filename, get the names, then either print the text output
    # or write it to a summary file
    pass
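The print-or-write branch described in the comments could look roughly like the sketch below. It is an illustration under assumptions: `baby_names_sketch()` is a hypothetical name, the extractor is passed in as the `extract` argument so the example stands alone, and writing to `filename + '.summary'` is an assumed naming scheme, not something the notebook specifies.

```python
def baby_names_sketch(file_list, extract, summary=False):
    """For each file, print the extracted names or write them to a summary file.

    extract is any callable that takes a filename and returns a list of strings.
    """
    for filename in file_list:
        names = extract(filename)
        text = '\n'.join(names) + '\n'
        if summary:
            # Assumption: the summary lives next to the input file.
            with open(filename + '.summary', 'w') as out:
                out.write(text)
        else:
            print(text, end='')
```

In the notebook itself, `baby_names()` would call `extract_names()` directly instead of taking the extractor as a parameter.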
In [ ]:
baby_names(['data/babynames/baby1990.html'])
In [ ]:
wordcount('topcount', 'data/wiki.txt')
In [ ]:
baby_names(['data/babynames/baby1996.html'], summary=True)
In [ ]:
baby_names(['data/babynames/baby2000.html', 'data/babynames/baby2002.html'])
Note: This notebook is an adaptation of Google's Python tutorial: https://developers.google.com/edu/python